M. Fawcett December 23, 2021
Between 1908 and 1942, the Sears Roebuck company sold houses in the form of build-it-yourself kits. The huge kits were prepared at factories located mostly in Illinois and shipped by railroad freight car to customers all over the country. Each kit contained all the materials needed to build a house except the foundation hole. Customers lugged their 25 tons of numbered precut lumber, shingles, wall board, flooring and so on to a building site and got to work, following the instructions in the 75-page construction guide.
This unlikely business concept was successful and resulted in sales of between 70,000 and 100,000 kits. Plus, as a mail-order marketer of all manner of merchandise, Sears sold tools used to build the houses, and then the appliances, furniture and fixtures that filled them when they were completed.
That success was cut short when the Great Depression and World War II took away most of the demand for new housing. After the war, a new form of housing, tract housing, took over and the kit home business faded into oblivion. The Sears company itself has lately been fading into oblivion. Having declared bankruptcy in 2018, Sears barely exists at all now except in legal proceedings and a few remaining stores as its assets are slowly liquidated.
With no official list of where kit homes were built, a few fascinated enthusiasts hunt them down and share their findings through social media and websites.
This report discusses where kit homes are located and offers clues as to where they are likely to be found. It will look at things like street name, building site distance from railroads, economic factors and population characteristics of areas.
My original goal was to create a computer program that could analyze a picture of a house and tell you whether it was a Sears kit home and, if so, its model name. This turns out to be a very hard problem because of the large number of models (around 370) that were produced over the years. Another goal was to have a computer program "crawl" through Google Street View images of houses and identify ones that had a high likelihood of being Sears kit homes. This turns out to be economically infeasible because Google charges 7/10ths of a cent (\$0.007) every time my computer program uses Street View to capture an image. Scanning 10,000 images of houses cost \$70.00. Scanning 1 million images would have cost \$7,000.00.
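The cost figures above are simple arithmetic. Here is a minimal sketch, assuming a flat rate of 7/10ths of a cent per image with no free tier or volume discount. The function name `street_view_cost` is made up for this illustration.

```python
from decimal import Decimal

# Assumed flat price per Street View image request: 7/10ths of a cent.
PRICE_PER_IMAGE = Decimal("0.007")

def street_view_cost(n_images: int) -> Decimal:
    """Estimated cost in dollars of capturing n_images images."""
    return PRICE_PER_IMAGE * n_images

# Print the estimated cost for the two image counts mentioned in the text.
for n in (10_000, 1_000_000):
    print(f"{n:>9,} images: ${street_view_cost(n):,.2f}")
```

Decimal is used instead of float so that fractions of a cent add up exactly.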
The analysis in this report was written in the Python language. Another software package, QGIS, was used to prepare some of the data displayed in the maps. A list (provided by Lara) of around 13,000 confirmed and not-yet-confirmed kit home locations forms the basis of this analysis.
# Load Python modules needed for the analysis
import pandas as pd # for dataframe manipulation
import numpy as np # for numerical analysis
import matplotlib.pyplot as plt # for generating plots and graphs
from matplotlib.pyplot import figure # for modifying appearance of plots & graphs
import requests # to make http requests to the US Census geocoder and the census web API
import io # for working with I/O streams and allow conversion of geocode response to dataframe
import csv # reading/writing csv and other text files
import pickle as pk # to store and retrieve dataframes on disk
import os # to list contents of disk drive folders
import folium # the mapping package
from folium import plugins # to allow cluster markers on maps
# Settings to improve the display of tabular results
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
This part of the analysis will be an interactive map of the US that shows the location of each kit home identified by hunters up through October 23, 2021.
# Read the Excel file of kit home locations into a Pandas dataframe.
address_df = pd.read_excel(r"Sears Roebuck Houses.xlsx", sheet_name = "Locations")
# Add a row number to each address. A unique number for each row will be needed by the
# US Census Bureau geocoder. Adding 2 accounts for the header row and for pandas starting
# its index at 0, so row_num is aligned with the row number in the original Excel file.
address_df.insert(loc=0, column='row_num', value=np.arange(len(address_df)) + 2)
# Remove the "?" from the Auth? column name.
address_df.rename(columns={"Auth?": "Auth"}, inplace = True)
# Examine some of the data
address_df.head()
# Total number of locations
n = len(pd.unique(address_df["Address"]))
print ("Number of locations:", f'{n:,}')
# Tidy up the values in the "Auth" column
# Change missing values (NaN) to "N/A".
address_df["Auth"] = address_df["Auth"].fillna('N/A')
# Make all the values in the Auth column uppercase
address_df["Auth"] = address_df["Auth"].str.upper()
# Count the number of locations for each Auth value
address_df.groupby("Auth").size()
state_count = address_df['State'].value_counts()
# Plot a barchart
figure(figsize=(16, 6))
state_count.plot.bar()
plt.title("Number of Locations by State")
plt.show()
Ohio has the most kit home locations followed by Illinois, Pennsylvania and New York. Every state has at least one possible kit home location.
The next step (computer code omitted) is to "geocode" each of the addresses. This involves submitting an address to the US Census Bureau's geocoding service and receiving the address's longitude and latitude, census tract number and Zip Code. This takes around 20 minutes for the entire list of 13,000 addresses. The geocoding results get stored in a computer file so the process doesn't need to run again.
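Since the geocoding code is omitted, here is a rough sketch of how such a step could look. It is not the code used for this report. It assumes the Census Bureau's public batch geocoding endpoint, the `Public_AR_Current` benchmark and `Current_Current` vintage, and a blank ZIP Code field on input. The function names and the column names `ADDRESS_IN`, `MATCH_TYPE`, `TIGER_EDGE`, `STREET_SIDE` and `CENSUS_BLOCK` are placeholders chosen for this example; the other column names match the ones used later in this report.

```python
import io

import pandas as pd
import requests

# Public batch endpoint of the US Census Bureau geocoder (returns geography codes too).
GEOCODER_URL = "https://geocoding.geo.census.gov/geocoder/geographies/addressbatch"
BATCH_SIZE = 10_000  # the batch service limits how many addresses one file may hold

# The geocoder returns a CSV without a header row, so column names are supplied here.
RESULT_COLUMNS = ["ID", "ADDRESS_IN", "MATCH_INDICATOR", "MATCH_TYPE", "ADDRESS_OUT",
                  "LONG_LAT", "TIGER_EDGE", "STREET_SIDE",
                  "FIPS_STATE", "FIPS_COUNTY", "CENSUS_TRACT", "CENSUS_BLOCK"]

def build_batch_csv(df):
    """Format rows as 'id,street,city,state,zip' lines with the ZIP left blank."""
    batch = df[["row_num", "Address", "City", "State"]].assign(Zip=None)
    return batch.to_csv(header=False, index=False)

def parse_geocoder_response(text):
    """Convert the geocoder's CSV response into a dataframe."""
    # Unmatched rows come back with fewer fields; pandas fills the rest with NaN.
    return pd.read_csv(io.StringIO(text), header=None, names=RESULT_COLUMNS)

def geocode(df):
    """Send the addresses to the geocoder in batches and combine the results."""
    results = []
    for start in range(0, len(df), BATCH_SIZE):
        chunk = df.iloc[start:start + BATCH_SIZE]
        response = requests.post(
            GEOCODER_URL,
            files={"addressFile": ("addresses.csv", build_batch_csv(chunk), "text/csv")},
            data={"benchmark": "Public_AR_Current", "vintage": "Current_Current"},
            timeout=600,
        )
        response.raise_for_status()
        results.append(parse_geocoder_response(response.text))
    return pd.concat(results, ignore_index=True)
```

With these assumptions, `geocode(address_df).to_pickle('geocoded_results.pkl')` would store the results on disk so the slow step only has to run once.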
Once the coordinates of the addresses are known, the mapping process can begin.
# Retrieve the coordinates and other results of the geocoding that were previously stored in a computer file.
geocoded_results_df = pd.read_pickle('geocoded_results.pkl')
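# Only keep rows that were successfully geocoded.
# The .copy() makes an independent dataframe so the column changes below don't trigger pandas warnings.
geocoded_results_df = geocoded_results_df[geocoded_results_df["MATCH_INDICATOR"] == "Match"].copy()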
# Convert geography code values from numeric to string and left pad them with zeros
# (state: 2 digits, county: 3 digits, census tract: 6 digits)
geocoded_results_df['FIPS_STATE'] = geocoded_results_df['FIPS_STATE'].astype(int).astype(str).str.zfill(2)
geocoded_results_df['FIPS_COUNTY'] = geocoded_results_df['FIPS_COUNTY'].astype(int).astype(str).str.zfill(3)
geocoded_results_df['CENSUS_TRACT'] = geocoded_results_df['CENSUS_TRACT'].astype(int).astype(str).str.zfill(6)
# Create a unique geographic identifier by combining state, county and census tract code for each row.
geocoded_results_df["GeoID"] = geocoded_results_df["FIPS_STATE"] \
    + geocoded_results_df["FIPS_COUNTY"] \
    + geocoded_results_df["CENSUS_TRACT"]
# Split the LONG_LAT column into separate numeric Longitude and Latitude columns
# (n must be passed by keyword in recent versions of pandas)
geocoded_results_df[['Longitude', 'Latitude']] = geocoded_results_df['LONG_LAT'].str.rsplit(',', n=1, expand=True).astype(float)
# Merge the Sears Kit Home style for each location from the original address list with the geocoded results.
mapping_data_df = pd.merge(left = address_df[['row_num','Model','Address','City','State','Auth']],
                           right = geocoded_results_df,
                           how = 'right',
                           left_on = 'row_num',
                           right_on = 'ID')
# Examine some of the results
mapping_data_df.head()
In Census Bureau speak, a GEOID is a string of digits that uniquely identifies the state, county and census tract of a location. It is used to look up all kinds of information about an area such as population, income, jobs, ages and much more.
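As a small worked example of how a tract-level GEOID is put together, the sketch below pads each code with zeros and joins them: 2 digits for the state, 3 for the county and 6 for the census tract. The function name `make_geoid` is made up for this illustration, and the county and tract values are arbitrary examples (39 is the state code for Ohio).

```python
def make_geoid(state_fips, county_fips, tract):
    """Join zero-padded state (2), county (3) and tract (6) codes into one string."""
    return f"{int(state_fips):02d}{int(county_fips):03d}{int(tract):06d}"

# State 39, county 49, tract 3000 become "39" + "049" + "003000".
print(make_geoid(39, 49, 3000))  # 39049003000
```

The zero padding matters because codes read in as numbers lose their leading zeros, and without them two different areas could end up with the same identifier.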
# Build a list containing all the coordinates so they can be plotted on the map
locations = mapping_data_df[['Latitude', 'Longitude']]
locationlist = locations.values.tolist()
len(locationlist)
# An example of one point in the kit home location list
locationlist[7]
The next computer code generates a USA map that shows the CONFIRMED (authenticated) kit home locations with DARK BLUE markers and the UNCONFIRMED (not authenticated) kit home locations with LIGHT BLUE markers.
Using the Map...
The thing that looks like a stack of square pancakes in the upper right corner of the map is a "layer control". You can use it to hide or show the confirmed and unconfirmed locations.
The + and - in the upper left of the map let you zoom in and out.
Clicking on a numbered marker zooms in and separates a bigger group into smaller groups.
Once you zoom in far enough you will see individual markers that tag a single location. These are the markers with a little "i" in the center. If you click on one of these you will see the address. Something that is kind of fun to do is to highlight the address (just the address), right-click the highlight, and select "Search with Google". In most cases it will bring up a Street View page for the house. There is no charge for this sort of use of Street View.
# Create a map using the Map() function and the coordinates of the locations of all the homes.
# Map starts out centered on Ohio.
mp = folium.Map(location=[40.367474, -82.996216], zoom_start=7, width=900, height=550, control_scale=True)
# Ohio_map
### Define functions to set the color of cluster markers. Confirmed and unconfirmed locations have
### different colors.
# This sets the color for CONFIRMED location clusters.
icon_create_function_confirmed = """
    function(cluster) {
        var childCount = cluster.getChildCount();
        /*
        // comment: can have something like the following to modify the different cluster sizes....
        var c = ' marker-cluster-';
        if (childCount < 50) {
            c += 'large';
        } else if (childCount < 300) {
            c += 'medium';
        } else {
            c += 'small';
        }
        // The marker-cluster-<'size'> gets passed in the "return new L.DivIcon()" function below.
        */
        return new L.DivIcon({ html: '<div><span style="background-color:darkblue;color:white;font-size: 20px;">' + childCount + '</span></div>', className: 'marker-cluster', iconSize: new L.Point(40, 30) });
    }
"""
# This sets the color of UNCONFIRMED location clusters.
icon_create_function_unconfirmed = """
    function(cluster) {
        var childCount = cluster.getChildCount();
        return new L.DivIcon({ html: '<div><span style="background-color:lightblue;color:black;font-size: 20px;">' + childCount + '</span></div>', className: 'marker-cluster', iconSize: new L.Point(40, 30) });
    }
"""
# Feature groups allow customization of layer control labels so they don't have to say "macro blah..."
fg_confirmed = folium.FeatureGroup(name = 'Confirmed Locations', show = True)
mp.add_child(fg_confirmed)
fg_unconfirmed = folium.FeatureGroup(name = 'Unconfirmed Locations', show = True)
mp.add_child(fg_unconfirmed)
# Add the Marker clusters for confirmed and unconfirmed locations to feature group
marker_cluster_confirmed = plugins.MarkerCluster(icon_create_function = icon_create_function_confirmed).add_to(fg_confirmed)
marker_cluster_unconfirmed = plugins.MarkerCluster(icon_create_function=icon_create_function_unconfirmed).add_to(fg_unconfirmed)
# A function to choose a marker color depending on whether the house is a confirmed kit house or not.
# The individual location markers use the same color as their cluster markers.
def getcolor(auth_val):
    if auth_val == 'YES':
        return ("darkblue", "Confirmed")
    return ("lightblue", "Unconfirmed")
### Add the Confirmed and Unconfirmed kit homes to their own map layers
# Each status has its own marker cluster, so one loop can fill both layers.
clusters = {"Confirmed": marker_cluster_confirmed, "Unconfirmed": marker_cluster_unconfirmed}
# Loop through all the location pairs.
for point in range(len(locationlist)):
    try:
        clr, status = getcolor(mapping_data_df["Auth"][point])
        folium.Marker(
            location = locationlist[point],
            popup = status + " " + mapping_data_df['Model'][point] + ": " + mapping_data_df['ADDRESS_OUT'][point],
            icon = folium.Icon(color = clr)
        ).add_to(clusters[status])
    except Exception: # skip rows with missing coordinates or missing popup text
        pass
# add layer control to map (allows layer to be turned on or off)
folium.LayerControl().add_to(mp)
# Display the map
mp